BIOTECHNOLOGY 
SYSTEMS 
BRANCH 



RAW SEQUENCE LISTING
ERROR REPORT 




The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 



Application Serial Number: 
Source:

Date Processed by STIC: 




THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION AND PATENTIN SOFTWARE QUESTIONS, PLEASE CONTACT 
MARK SPENCER, 703-308-4212. 



TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER 
VERSION 4.1 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW FOR ADDRESS: 

http://www.uspto.gov/web/offices/pac/checker/chkr41note.htm



Applicants submitting genetic sequence information electronically on diskette or CD-Rom should be aware that there is
a possibility that the disk/CD-Rom may have been affected by treatment given to all incoming mail.
Please consider using alternate methods of submission for the disk/CD-Rom or replacement disk/CD-Rom.

Any reply including a sequence listing in electronic form should NOT be sent to the 20231 zip code address for the
United States Patent and Trademark Office, and instead should be sent via the following to the indicated addresses:

1. EFS-Bio (<http://www.uspto.gov/ebc/efs/downloads/documents.htm>, EFS Submission
User Manual - ePAVE)

2. U.S. Postal Service: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450 

3. Hand Carry directly to (EFFECTIVE 12/01/2003): 

U.S. Patent and Trademark Office, Box Sequence, Customer Window, Lobby, Room 1B03, Crystal Plaza Two,
2011 South Clark Place, Arlington, VA 22202

4. Federal Express, United Parcel Service, or other delivery service to: U.S. Patent and Trademark Office,
Box Sequence, Room 1B03-Mailroom, Crystal Plaza Two, 2011 South Clark Place, Arlington, VA 22202

Revised 10/08/2003 



Raw Sequence Listing Error Summary 



ERROR DETECTED SUGGESTED CORRECTION SERIAL NUMBER:

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE 

Wrapped Nucleics
Wrapped Aminos
The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
prevent "wrapping."

Invalid Line Length
The rules require that a line not exceed 72 characters in length. This includes white spaces.

Misaligned Amino Numbering
The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
use space characters, instead.

Non-ASCII
The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
ensure your subsequent submissions are saved in ASCII text.

Variable Length
Sequence(s) __ contain n's or Xaa's representing more than one residue. Per Sequence Rules,
each n or Xaa can only represent a single residue. Please present the maximum number of each
residue having variable length and indicate in the <220>-<223> section that some may be missing.

PatentIn 2.0 "bug"
A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
sequence(s) __. Normally, PatentIn would automatically generate this section from the
previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
Artificial or Unknown sequences.

Skipped Sequences (OLD RULES)
Sequence(s) __ missing. If intentional, please insert the following lines for each skipped sequence:

(2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
(i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
(xi) SEQUENCE DESCRIPTION:SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
This sequence is intentionally skipped

Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

Skipped Sequences (NEW RULES)
Sequence(s) __ missing. If intentional, please insert the following lines for each skipped sequence.

<210> sequence id number
<400> sequence id number
000



Use of n's or Xaa's (NEW RULES)
Use of n's and/or Xaa's have been detected in the Sequence Listing.
Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

Invalid <213> Response
Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
is Artificial Sequence.

Use of <220>
Sequence(s) __ missing the <220> "Feature" and associated numeric identifiers and responses.
Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
"Unknown." Please explain source of genetic material in <220> to <223> section.
(See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

PatentIn 2.0 "bug"
Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

Misuse of n/Xaa
"n" can only represent a single nucleotide; "Xaa" can only represent a single amino acid.



AMC - Biotechnology Systems Branch - 09/09/2003 
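
The line-level entries in the summary above (Invalid Line Length, Misaligned Amino Numbering, Non-ASCII) describe checks that can be run on a listing file before it is submitted. What follows is a minimal Python sketch of such a pre-check, not the Checker program itself. The function name `check_lines` and the message strings are hypothetical; only the 72-character limit, the rule against tab codes, and the ASCII requirement are taken from the summary.

```python
MAX_LINE_LENGTH = 72  # limit stated in the "Invalid Line Length" entry


def check_lines(text):
    """Return (line_number, problem) tuples for the text of a listing file.

    Hypothetical helper: it covers only the three line-level rules named
    in the error summary and nothing else.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # White space counts toward the limit, so measure the raw line.
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line exceeds 72 characters"))
        # Tab codes are what misalign the numbering; spaces are expected.
        if "\t" in line:
            problems.append((number, "tab character present"))
        # The file must be plain ASCII text.
        if not line.isascii():
            problems.append((number, "non-ASCII character present"))
    return problems


if __name__ == "__main__":
    sample = "a" * 73 + "\n" + "1\t5\n" + "caf\u00e9\n"
    for number, problem in check_lines(sample):
        print(number, problem)
```

A file that passes this sketch can still fail the other entries in the summary, which concern the content of the numeric identifiers rather than the shape of the lines.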







RAW SEQUENCE LISTING DATE: 10/07/2003 

PATENT APPLICATION: US/10/668,749 TIME: 09:18:34

Input Set : A:\50112-1580.txt 

Output Set: N:\CRF4\10072003\J668749.raw 

3 <110> APPLICANT: Agilent Technologies 

5 <120> TITLE OF INVENTION: Methods and Systems for Nanopore Data Analysis 
7 <130> FILE REFERENCE: 50112-1580 
C--> 9 <140> CURRENT APPLICATION NUMBER: US/10/668,749

10 <141> CURRENT FILING DATE: 2003-09-23 

12 <160> NUMBER OF SEQ ID NOS: 8

14 <170> SOFTWARE: PatentIn version 3.2

[Stamp: Does Not Comply. Corrected Diskette Needed.]

16 <210> SEQ ID NO: 1

17 <211> LENGTH: 1300

18 <212> TYPE: DNA

19 <213> ORGANISM: neucleotides

21 <400> SEQUENCE: 1

22 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 60 
24 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 120 
26 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 180 
28 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 240 
30 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 300 
32 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 360 
34 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 420 
36 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 480 
38 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 540 
40 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 600 
42 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 660 
44 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 720 
46 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 780
48 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 840 
50 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 900 
52 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 960 
54 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1020 
56 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1080 
58 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1140 
60 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1200 
62 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1260 
64 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1300 
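
In the entry above, each sequence row ends with what appears to be a running residue count (60, 120, and so on up to 1300), and the last value agrees with the <211> LENGTH response. The following Python sketch shows one way to check that agreement. It rests on two assumptions that the printout does not state outright: that the trailing number on a row is the cumulative count, and that a leading number is a printout line number to be ignored. The function name `check_running_totals` is hypothetical.

```python
def check_running_totals(rows, declared_length):
    """Compare counted residues with row totals and the <211> length.

    Assumes (not confirmed by the printout) that a trailing number is the
    cumulative residue count and a leading number is a printout line number.
    """
    total = 0
    problems = []
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue
        shown = None
        # Trailing number: the running total printed at the end of the row.
        if len(tokens) > 1 and tokens[-1].isdigit():
            shown = int(tokens.pop())
        # Leading number: the printout's own line number, not sequence data.
        if tokens[0].isdigit():
            tokens.pop(0)
        total += sum(len(block) for block in tokens)
        if shown is not None and shown != total:
            problems.append(f"row shows {shown}, counted {total}")
    if total != declared_length:
        problems.append(f"<211> declares {declared_length}, counted {total}")
    return problems


if __name__ == "__main__":
    block = "a" * 10
    rows = [
        "22 " + " ".join([block] * 6) + " 60",
        "24 " + " ".join([block] * 4) + " 100",
    ]
    print(check_running_totals(rows, 100))
```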

67 <210> SEQ ID NO: 2 

68 <211> LENGTH: 500 

69 <212> TYPE: DNA 

70 <213> ORGANISM: neucleotides

72 <400> SEQUENCE: 2

73 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 60 
75 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 120 
77 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 180 
79 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 240 







81 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 300
83 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 360
85 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 420
87 cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc cccccccccc 480
89 cccccccccc cccccccccc 500

92 <210> SEQ ID NO: 3

93 <211> LENGTH: 70

94 <212> TYPE: DNA

95 <213> ORGANISM: neucleotides

97 <400> SEQUENCE: 3

98 ccacaaacaa acaaccacac aaacacacaa ccacaacacc aacacacaaa caaaccaaca 60
100 cacaaactcc 70

103 <210> SEQ ID NO: 4

104 <211> LENGTH: 48

105 <212> TYPE: DNA

106 <213> ORGANISM: neucleotides

108 <400> SEQUENCE: 4

109 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 48

112 <210> SEQ ID NO: 5

113 <211> LENGTH: 196

114 <212> TYPE: DNA

115 <213> ORGANISM: neucleotides

117 <400> SEQUENCE: 5

118 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 60
120 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 120
122 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 180
124 aaaaaaaaaa aaaaaa 196

127 <210> SEQ ID NO: 6

128 <211> LENGTH: 48

129 <212> TYPE: DNA

130 <213> ORGANISM: neucleotides

132 <400> SEQUENCE: 6

133 caaacaaacc aacacacaaa ctcccctcaa acacacaacc aaacaaac 48

136 <210> SEQ ID NO: 7

137 <211> LENGTH: 100

138 <212> TYPE: DNA

139 <213> ORGANISM: neucleotides

141 <400> SEQUENCE: 7

142 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 60
144 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 100

147 <210> SEQ ID NO: 8

148 <211> LENGTH: 87

149 <212> TYPE: DNA

150 <213> ORGANISM: neucleotides

152 <400> SEQUENCE: 8

153 ccacaaacaa acaaccacac aaacacacaa ccacaacacc aacacacaaa caaaccaaca 60
155 cacaaactcc tatagtgagt cgtatta 87

VERIFICATION SUMMARY DATE: 10/07/2003

PATENT APPLICATION: US/10/668,749 TIME: 09:18:35 

Input Set : A:\50112-1580.txt 

Output Set: N:\CRF4\10072003\J668749.raw 

L:9 M:270 C: Current Application Number differs, Replaced Current Application Number 
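
The Invalid <213> Response and Use of n's or Xaa's entries in the error summary state rules that can be expressed as a short check on each sequence entry. The Python sketch below is one hedged illustration, not the logic of the PTO software. The function name `check_entry`, its parameters, and the message strings are hypothetical; the Genus/species pattern is a rough heuristic introduced here; and the use of `PRT` as the <212> code for amino acid sequences is an assumption not shown in this printout.

```python
import re

# Fixed responses named in the "Invalid <213> Response" entry.
FIXED_RESPONSES = {"unknown", "artificial sequence"}
# Heuristic only (an assumption): capitalised genus, then lower-case species.
GENUS_SPECIES = re.compile(r"^[A-Z][a-z]+ [a-z]+$")


def check_entry(organism, seq_type, sequence, has_feature_section):
    """Return a list of rule problems for one sequence entry.

    has_feature_section says whether a <220>-<223> section is present.
    """
    problems = []
    response = organism.strip()
    is_fixed = response.lower() in FIXED_RESPONSES
    if not is_fixed and not GENUS_SPECIES.match(response):
        problems.append("invalid <213> response")
    if is_fixed and not has_feature_section:
        problems.append("<220>-<223> required for this <213> response")
    if seq_type == "PRT":
        # Amino acids are assumed to be space-separated three-letter codes.
        has_unknown = "Xaa" in sequence.split()
    else:
        has_unknown = "n" in sequence.replace(" ", "").lower()
    if has_unknown and not has_feature_section:
        problems.append("<220>-<223> required when n or Xaa is present")
    return problems


if __name__ == "__main__":
    print(check_entry("neucleotides", "DNA", "aaaaaaaaaa aaaaaaaaaa", False))
```

Under this sketch, a response such as "neucleotides" is reported because it is neither one of the two fixed responses nor shaped like a scientific name.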